
    Probabilistic Polynomials and Hamming Nearest Neighbors

    We show how to compute any symmetric Boolean function on $n$ variables over any field (as well as the integers) with a probabilistic polynomial of degree $O(\sqrt{n \log(1/\epsilon)})$ and error at most $\epsilon$. The degree dependence on $n$ and $\epsilon$ is optimal, matching a lower bound of Razborov (1987) and Smolensky (1987) for the MAJORITY function. The proof is constructive: a low-degree polynomial can be efficiently sampled from the distribution. This polynomial construction is combined with other algebraic ideas to give the first subquadratic time algorithm for computing a (worst-case) batch of Hamming distances in superlogarithmic dimensions, exactly. To illustrate, let $c(n) : \mathbb{N} \rightarrow \mathbb{N}$. Suppose we are given a database $D$ of $n$ vectors in $\{0,1\}^{c(n) \log n}$ and a collection of $n$ query vectors $Q$ in the same dimension. For all $u \in Q$, we wish to compute a $v \in D$ with minimum Hamming distance from $u$. We solve this problem in $n^{2 - 1/O(c(n) \log^2 c(n))}$ randomized time. Hence, the problem is in "truly subquadratic" time for $O(\log n)$ dimensions, and in subquadratic time for $d = o((\log^2 n)/(\log \log n)^2)$. We apply the algorithm to computing pairs with maximum inner product, closest pair in $\ell_1$ for vectors with bounded integer entries, and pairs with maximum Jaccard coefficients.
    Comment: 16 pages. To appear in the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2015).
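    A minimal Python sketch of the probabilistic-polynomial idea, shown for the simpler OR function via the classical Razborov-style construction over $\mathbb{F}_2$ (degree $O(\log(1/\epsilon))$, one-sided error). The paper's degree-$O(\sqrt{n \log(1/\epsilon)})$ construction for general symmetric functions is more involved, so this illustrates the model rather than the paper's algorithm; all names here are ours.

    ```python
    import math
    import random

    def sample_or_polynomial(n, eps):
        """Sample a Razborov-style probabilistic polynomial for OR on n bits
        over F_2: degree d = ceil(log2(1/eps)), one-sided error at most eps."""
        d = max(1, math.ceil(math.log2(1 / eps)))
        # Each factor is a uniformly random F_2 linear form <r_i, x>.
        R = [[random.randint(0, 1) for _ in range(n)] for _ in range(d)]

        def p(x):
            # p(x) = 1 - prod_i (1 - <r_i, x>) over F_2, a degree-d polynomial.
            prod = 1
            for r in R:
                lin = sum(ri * xi for ri, xi in zip(r, x)) % 2
                prod = (prod * (1 - lin)) % 2
            return (1 - prod) % 2

        return p

    # If x = 0^n then p(x) = 0 always; if x != 0 then each linear form is 1
    # with probability 1/2, so p(x) = 1 except with probability 2^(-d) <= eps.
    p = sample_or_polynomial(4, eps=0.01)
    print(p([0, 1, 0, 0]))  # 1 with probability >= 0.99
    ```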

    OV Graphs Are (Probably) Hard Instances

    A graph $G$ on $n$ nodes is an Orthogonal Vectors (OV) graph of dimension $d$ if there are vectors $v_1, \ldots, v_n \in \{0,1\}^d$ such that nodes $i$ and $j$ are adjacent in $G$ if and only if $\langle v_i, v_j \rangle = 0$ over $\mathbb{Z}$. In this paper, we study a number of basic graph algorithm problems, where one is given as input the vectors defining an OV graph instead of a general graph. We show that for each of the following problems, an algorithm solving it faster on such OV graphs $G$ of dimension only $d = O(\log n)$ than in the general case would refute a plausible conjecture about the time required to solve sparse MAX-$k$-SAT instances:
    - determining whether $G$ contains a triangle, and more generally, whether $G$ contains a directed $k$-cycle for any $k \geq 3$;
    - computing the square of the adjacency matrix of $G$ over $\mathbb{Z}$ or $\mathbb{F}_2$;
    - maintaining the shortest distance between two fixed nodes of $G$, or whether $G$ has a perfect matching, when $G$ is a dynamically updating OV graph.
    We also prove some complementary results about OV graphs. We show that any problem which is NP-hard on constant-degree graphs is also NP-hard on OV graphs of dimension $O(\log n)$, and we give two problems which can be solved faster on OV graphs than in general: Maximum Clique and Online Matrix-Vector Multiplication.
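    The definition above translates directly into code. As a hedged reference point (function names are ours), this sketch materializes an OV graph's adjacency matrix from its defining vectors in the obvious $O(n^2 d)$ time:

    ```python
    import numpy as np

    def ov_graph_adjacency(vectors):
        """Adjacency matrix of the OV graph defined by the given 0/1 vectors:
        nodes i and j are adjacent iff <v_i, v_j> = 0 over the integers."""
        V = np.asarray(vectors, dtype=np.int64)  # shape (n, d)
        inner = V @ V.T                          # all n^2 pairwise inner products
        A = (inner == 0).astype(np.int8)
        np.fill_diagonal(A, 0)                   # no self-loops
        return A

    # Toy instance with d = 3: only vectors 0 and 2 are orthogonal.
    print(ov_graph_adjacency([[1, 0, 1],
                              [1, 1, 0],
                              [0, 1, 0]]))
    ```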

    Limits on the Universal Method for Matrix Multiplication

    In this work, we prove limitations on the known methods for designing matrix multiplication algorithms. Alman and Vassilevska Williams [Alman and Williams, 2018] recently defined the Universal Method, which substantially generalizes all the known approaches including Strassen's Laser Method [V. Strassen, 1987] and Cohn and Umans' Group Theoretic Method [Cohn and Umans, 2003]. We prove concrete lower bounds on the algorithms one can design by applying the Universal Method to many different tensors. Our proofs use new tools for upper bounding the asymptotic slice rank of a wide range of tensors. Our main result is that the Universal Method applied to any Coppersmith-Winograd tensor $CW_q$ cannot yield a bound on $\omega$, the exponent of matrix multiplication, better than 2.16805. By comparison, it was previously only known that the weaker "Galactic Method" applied to $CW_q$ could not achieve an exponent of 2. We also study the Laser Method (which is, in principle, a highly special case of the Universal Method) and prove that it is "complete" for matrix multiplication algorithms: when it applies to a tensor $T$, it achieves $\omega = 2$ if and only if it is possible for the Universal Method applied to $T$ to achieve $\omega = 2$. Hence, the Laser Method, which was originally used as an algorithmic tool, can also be seen as a lower bounding tool. For example, in their landmark paper, Coppersmith and Winograd [Coppersmith and Winograd, 1990] achieved a bound of $\omega \leq 2.376$ by applying the Laser Method to $CW_q$. By our result, the fact that they did not achieve $\omega = 2$ implies a lower bound on the Universal Method applied to $CW_q$. Indeed, if it were possible for the Universal Method applied to $CW_q$ to achieve $\omega = 2$, then Coppersmith and Winograd's application of the Laser Method would have achieved $\omega = 2$.
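    For readers who want the central object in hand, here is a short sketch that materializes $CW_q$ explicitly. It assumes the standard definition of the Coppersmith-Winograd tensor, which the abstract itself does not spell out, so treat the definition in the docstring as our assumption.

    ```python
    import numpy as np

    def cw_tensor(q):
        """The Coppersmith-Winograd tensor CW_q as an explicit 0/1 array over
        indices {0, ..., q+1}^3, following the standard definition:
          sum_{i=1}^{q} (x_0 y_i z_i + x_i y_0 z_i + x_i y_i z_0)
          + x_0 y_0 z_{q+1} + x_0 y_{q+1} z_0 + x_{q+1} y_0 z_0."""
        T = np.zeros((q + 2, q + 2, q + 2), dtype=np.int8)
        for i in range(1, q + 1):
            T[0, i, i] = T[i, 0, i] = T[i, i, 0] = 1
        T[0, 0, q + 1] = T[0, q + 1, 0] = T[q + 1, 0, 0] = 1
        return T

    print(int(cw_tensor(2).sum()))  # 3q + 3 = 9 nonzero entries when q = 2
    ```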

    Faster Walsh-Hadamard Transform and Matrix Multiplication over Finite Fields using Lookup Tables

    We use lookup tables to design faster algorithms for important algebraic problems over finite fields. These faster algorithms, which only use arithmetic operations and lookup table operations, may help to explain the difficulty of determining the complexities of these important problems. Our results over a constant-sized finite field are as follows. The Walsh-Hadamard transform of a vector of length $N$ can be computed using $O(N \log N / \log \log N)$ bit operations. This generalizes to any transform defined as a Kronecker power of a fixed matrix. By comparison, the Fast Walsh-Hadamard transform (similar to the Fast Fourier transform) uses $O(N \log N)$ arithmetic operations, which is believed to be optimal up to constant factors. Any algebraic algorithm for multiplying two $N \times N$ matrices using $O(N^\omega)$ operations can be converted into an algorithm using $O(N^\omega / (\log N)^{\omega/2 - 1})$ bit operations. For example, Strassen's algorithm can be converted into an algorithm using $O(N^{2.81} / (\log N)^{0.4})$ bit operations. It remains an open problem with practical implications to determine the smallest constant $c$ such that Strassen's algorithm can be implemented to use $c \cdot N^{2.81} + o(N^{2.81})$ arithmetic operations; using a lookup table allows one to save a super-constant factor in bit operations.
    Comment: 10 pages, to appear in the 6th Symposium on Simplicity in Algorithms (SOSA 2023).
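    For contrast with the lookup-table result, a minimal sketch (names ours) of the classical Fast Walsh-Hadamard transform, here over the integers, which uses the $O(N \log N)$ arithmetic operations mentioned above:

    ```python
    def fwht(a):
        """In-place fast Walsh-Hadamard transform of a length-2^k list, using
        O(N log N) arithmetic operations -- the classical baseline that the
        lookup-table method above beats in the bit-operation model."""
        n = len(a)
        assert n & (n - 1) == 0, "length must be a power of two"
        h = 1
        while h < n:
            for i in range(0, n, 2 * h):
                for j in range(i, i + h):
                    x, y = a[j], a[j + h]
                    a[j], a[j + h] = x + y, x - y  # 2x2 Hadamard butterfly
            h *= 2
        return a

    print(fwht([1, 0, 1, 0]))  # [2, 2, 0, 0]
    ```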

    How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation

    In the classical transformer attention scheme, we are given three $n \times d$ matrices $Q, K, V$ (the query, key, and value tokens), and the goal is to compute a new $n \times d$ matrix $D^{-1} \exp(QK^\top) V$ where $D = \mathrm{diag}(\exp(QK^\top) \mathbf{1}_n)$. In this work, we study a generalization of attention which captures triple-wise correlations. This generalization is able to solve problems about detecting triple-wise connections that were shown to be impossible for transformers. The potential downside of this generalization is that it appears as though computations are even more difficult, since the straightforward algorithm requires cubic time in $n$. However, we show that in the bounded-entry setting (which arises in practice and is well-studied in both theory and practice), there is actually a near-linear time algorithm. More precisely, we show that bounded entries are both necessary and sufficient for quickly performing generalized computations:
    - On the positive side, if all entries of the input matrices are bounded above by $o(\sqrt[3]{\log n})$, then we show how to approximate the "tensor-type" attention matrix in $n^{1+o(1)}$ time.
    - On the negative side, we show that if the entries of the input matrices may be as large as $\Omega(\sqrt[3]{\log n})$, then there is no algorithm that runs faster than $n^{3-o(1)}$ (assuming the Strong Exponential Time Hypothesis from fine-grained complexity theory).
    We also show that our construction, algorithms, and lower bounds naturally generalize to higher-order tensors and correlations. Interestingly, the higher the order of the tensors, the lower the bound on the entries needs to be for an efficient algorithm. Our results thus yield a natural tradeoff between the boundedness of the entries and the order of the tensor one may use for more expressive, efficient attention computation.
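    The classical attention formula above is easy to transcribe directly. This hedged numpy sketch computes it naively in $O(n^2 d)$ time; the triple-wise generalization studied in the paper would analogously cost cubic time in $n$ if done naively.

    ```python
    import numpy as np

    def attention(Q, K, V):
        """Direct numpy transcription of the formula above:
        D^{-1} exp(Q K^T) V with D = diag(exp(Q K^T) 1_n).
        Not numerically hardened; for small n and d only."""
        A = np.exp(Q @ K.T)               # n x n matrix of attention scores
        D_inv = 1.0 / A.sum(axis=1)       # row sums form the diagonal of D
        return (A * D_inv[:, None]) @ V   # normalize rows, then mix values

    rng = np.random.default_rng(0)
    n, d = 4, 3
    Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
    print(attention(Q, K, V).shape)  # (4, 3); each row is a convex mix of V's rows
    ```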

    Further Limitations of the Known Approaches for Matrix Multiplication

    We consider the techniques behind the current best algorithms for matrix multiplication. Our results are threefold. (1) We provide a unifying framework, showing that all known matrix multiplication running times since 1986 can be achieved from a single very natural tensor: the structural tensor $T_q$ of addition modulo an integer $q$. (2) We show that if one applies a generalization of the known techniques (arbitrary zeroing out of tensor powers to obtain independent matrix products in order to use the asymptotic sum inequality of Schönhage) to an arbitrary monomial degeneration of $T_q$, then there is an explicit lower bound, depending on $q$, on the bound on the matrix multiplication exponent $\omega$ that one can achieve. We also show upper bounds on the value $\alpha$ that one can achieve, where $\alpha$ is such that $n \times n^\alpha \times n$ matrix multiplication can be computed in $n^{2+o(1)}$ time. (3) We show that our lower bound on $\omega$ approaches $2$ as $q$ goes to infinity. This suggests a promising approach to improving the bound on $\omega$: for variable $q$, find a monomial degeneration of $T_q$ which, using the known techniques, produces an upper bound on $\omega$ as a function of $q$. Then, take $q$ to infinity. It is not ruled out, and hence possible, that one can obtain $\omega = 2$ in this way.
    Comment: 16 pages. To appear in the 9th Innovations in Theoretical Computer Science Conference (ITCS 2018).
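    As a concrete aid, a small sketch constructing $T_q$ explicitly. We take the structural tensor of addition modulo $q$ to be the 0/1 tensor supported on triples $(i, j, k)$ with $i + j \equiv k \pmod{q}$; the abstract does not give the definition, so this reading is our assumption.

    ```python
    import numpy as np

    def t_q(q):
        """Structural tensor T_q of addition modulo q: the q x q x q 0/1 tensor
        with a 1 in position (i, j, k) exactly when i + j = k (mod q)."""
        T = np.zeros((q, q, q), dtype=np.int8)
        for i in range(q):
            for j in range(q):
                T[i, j, (i + j) % q] = 1
        return T

    print(int(t_q(5).sum()))  # q^2 = 25: exactly one k for each pair (i, j)
    ```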

    Matrix Multiplication and Number on the Forehead Communication

    Three-player Number On the Forehead communication may be thought of as a three-player Number In the Hand promise model, in which each player is given the inputs that are supposedly on the other two players' heads, and promised that they are consistent with the inputs of the other players. The set of all allowed inputs under this promise may be thought of as an order-3 tensor. We surprisingly observe that this tensor is exactly the matrix multiplication tensor, which is widely studied in the design of fast matrix multiplication algorithms. Using this connection, we prove a number of results about both Number On the Forehead communication and matrix multiplication, each by using known results or techniques about the other. For example, we show how the Laser Method, a key technique used to design the best matrix multiplication algorithms, can also be used to design communication protocols for a variety of problems. We also show how known lower bounds for Number On the Forehead communication can be used to bound properties of the matrix multiplication tensor such as its zeroing-out subrank. Finally, we substantially generalize known methods based on slice rank for studying communication, and show how they directly relate to the matrix multiplication exponent $\omega$.
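    To make the observed correspondence concrete, a hedged sketch (index conventions are ours) enumerating the support of the matrix multiplication tensor, i.e., the set of consistent input triples in the promise model described above:

    ```python
    def mm_tensor_support(a, b, c):
        """Support of the matrix multiplication tensor <a, b, c>: the triples
        ((i, j), (j, k), (k, i)) of index pairs appearing in
        sum_{i,j,k} x_{ij} y_{jk} z_{ki}. Under the promise model above, these
        are exactly the consistent Number On the Forehead input triples:
        each pair of the three index pairs agrees on its shared coordinate."""
        return {((i, j), (j, k), (k, i))
                for i in range(a) for j in range(b) for k in range(c)}

    print(len(mm_tensor_support(2, 2, 2)))  # a * b * c = 8 support triples
    ```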

    Dynamic Parameterized Problems and Algorithms

    Fixed-parameter algorithms and kernelization are two powerful methods to solve NP-hard problems. Yet, so far those algorithms have been largely restricted to static inputs. In this paper we provide fixed-parameter algorithms and kernelizations for fundamental NP-hard problems with dynamic inputs. We consider a variety of parameterized graph and hitting set problems which are known to have $f(k) \cdot n^{1+o(1)}$ time algorithms on inputs of size $n$, and we consider the question of whether there is a data structure that supports small updates (such as edge/vertex/set/element insertions and deletions) with an update time of $g(k) \cdot n^{o(1)}$; such an update time would be essentially optimal. Update and query times independent of $n$ are particularly desirable. Among many other results, we show that Feedback Vertex Set and $k$-Path admit dynamic algorithms with $f(k) \cdot \log^{O(1)} n$ update and query times for some function $f$ depending on the solution size $k$ only. We complement our positive results by several conditional and unconditional lower bounds. For example, we show that unlike their undirected counterparts, Directed Feedback Vertex Set and Directed $k$-Path do not admit dynamic algorithms with $n^{o(1)}$ update and query times even for constant solution sizes $k \leq 3$, assuming popular hardness hypotheses. We also show that unconditionally, in the cell probe model, Directed Feedback Vertex Set cannot be solved with update time that is purely a function of $k$.
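    For orientation, a deliberately naive baseline (all names ours) showing the dynamic interface in question for $k$-Path: updates are cheap, but each query recomputes from scratch in $n^{O(k)}$ time, which is exactly the dependence on $n$ that the paper's $f(k) \cdot \log^{O(1)} n$ data structures are designed to avoid.

    ```python
    class DynamicKPath:
        """Naive baseline for the dynamic k-Path interface discussed above:
        constant-time edge updates, but each query recomputes from scratch by
        brute-force DFS over simple paths. Illustrative only."""

        def __init__(self, n):
            self.adj = {v: set() for v in range(n)}

        def insert_edge(self, u, v):
            self.adj[u].add(v)
            self.adj[v].add(u)

        def delete_edge(self, u, v):
            self.adj[u].discard(v)
            self.adj[v].discard(u)

        def has_k_path(self, k):
            # Is there a simple path on k vertices?
            def extend(path, used):
                if len(path) == k:
                    return True
                return any(extend(path + [w], used | {w})
                           for w in self.adj[path[-1]] if w not in used)
            return any(extend([v], {v}) for v in self.adj)

    G = DynamicKPath(4)
    for u, v in [(0, 1), (1, 2), (2, 3)]:
        G.insert_edge(u, v)
    print(G.has_k_path(4))  # True: the path 0-1-2-3
    G.delete_edge(1, 2)
    print(G.has_k_path(4))  # False after the update
    ```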

    Tensor Ranks and the Fine-Grained Complexity of Dynamic Programming

    Generalizing work of Künnemann, Paturi, and Schneider [ICALP 2017], we study a wide class of high-dimensional dynamic programming (DP) problems in which one must find the shortest path between two points in a high-dimensional grid, given a tensor of transition costs between nodes in the grid. This captures many classical problems which are solved using DP, such as the knapsack problem, the airplane refueling problem, and the minimum-weight polygon triangulation problem. We observe that for many of these problems, the tensor naturally has low tensor rank or low slice rank. We then give new algorithms and a web of fine-grained reductions to tightly determine the complexity of these problems. For instance, we show that a polynomial speedup over the DP algorithm is possible when the tensor rank is a constant or the slice rank is 1, but that such a speedup is impossible if the tensor rank is slightly super-constant (assuming SETH) or the slice rank is at least 3 (assuming the APSP conjecture). We find that this characterizes the known complexities for many of these problems, and in some cases leads to new faster algorithms.
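    To ground the setup, a hedged sketch (names ours) of the simplest, one-dimensional member of this DP family: a least-weight-path recurrence on a line with an explicit cost matrix (an order-2 tensor). Airplane-refueling-style problems fit this shape with structured costs.

    ```python
    def grid_dp(w):
        """One-dimensional instance of the DP family above: given an
        (n+1) x (n+1) matrix w of transition costs, compute the cheapest
        strictly-forward path from node 0 to node n via
            T[j] = min_{i < j} T[i] + w[i][j],  T[0] = 0.
        The straightforward recurrence below costs O(n^2); the paper asks
        when low tensor rank or slice rank of w permits a polynomial speedup."""
        n = len(w) - 1
        T = [0.0] + [float("inf")] * n
        for j in range(1, n + 1):
            T[j] = min(T[i] + w[i][j] for i in range(j))
        return T[n]

    # Toy costs w[i][j] = (j - i)^2 reward many short hops.
    n = 5
    w = [[(j - i) ** 2 for j in range(n + 1)] for i in range(n + 1)]
    print(grid_dp(w))  # 5.0: five unit hops of cost 1 each
    ```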